I. INTRODUCTION AND BACKGROUND



The Model Output Statistics (MOS) technique involves the 
processing of atmospheric parameters output from numerical 
weather prediction models (predictors), along with observed 
data, to produce forecast algorithms of meteorological 
parameters (predictands). The predictands are either
operationally important parameters not forecast by numerical
models (e.g., visibility, cloud amount, ceiling) or model
output parameters whose predictive skill is improved
(e.g., surface wind, temperature) through partial correction
of numerical model bias and/or scale.

The National Weather Service (NWS) uses a linear, 
least-squares regression model to generate empirical 
forecast equations. This MOS technique has demonstrated 
operationally usable skill in forecasting numerous weather 
elements at land locations throughout the world [Best and 
Pryor, 1983]. Both the United States Air Force and Navy 
have made limited use of the NWS model for selected land 
areas around the world. The Navy has attempted to forecast 
open-ocean fog and visibility using linear regression 
equations, with the resultant skill levels exceeding 
persistence, climatology and those of the NWS as well. 
However, these limited experiments produced results 
considered only marginally useful for operational situations
[Aldinger, 1979; Yavorsky, 1980; Selsor, 1980; Koziara et
al., 1983; Renard and Thompson, 1984]. Undoubtedly, this
performance level is due, in part, to the lack of
'calibrated' fog and visibility observations. At sea,
weather observers lack the reference points necessary to 
accurately estimate the visibility. 

Because of the potential for success demonstrated by the 
above cited experiments, the Navy began development of an 
MOS program in the spring of 1983 to forecast operationally 
important air/ocean parameters over all ocean areas in both 
hemispheres. Horizontal visibility was selected as the 
first parameter to be investigated due to its importance to 
the mariner. Because linear regression techniques over land
areas (NWS, 1960--date) and the North Pacific Ocean (Navy,
late 1970's) demonstrated considerably less-than-perfect
results, other statistical methods were proposed to
determine if a better one could be found.

Preisendorfer (1983a,b,c) proposed three strategies,
two based on maximum probability and one based on natural 
regression. Lowe (1984a) proposed innovative threshold 
techniques to be applied with the linear regression 
approach. All of these methods were developed, applied and 
tested on North Pacific and North Atlantic Ocean areas by 
Karl (1984) and on additional North Atlantic Ocean areas by 
Diunizio (1984a) in their investigations of visibility.
Wooster (1984) applied the same techniques to cloud amount
and ceiling height parameters. 

This study presents a Principal Discriminant Method 
(PDM) of statistical analysis as developed for the MOS 
problem by Preisendorfer (1984). Significance testing 
methods proposed by Mr. Paul Lowe, Naval Environmental
Prediction Research Facility, and investigated by Diunizio
(1984b), were also utilized. These results are compared
with results obtained from the aforementioned methodologies. 

In the following discussion, a sufficient number of 
terms and symbols are defined to allow readers without 
strong statistical backgrounds to understand the results. 
However, for a proper understanding of the Preisendorfer 
(1984) methodology, readers are encouraged to examine 
Appendix A for a detailed discussion. Details on the 
significance testing [Diunizio, 1984b] are found in Appendix 
B. 



Conversation, and unpublished notes.

II. OBJECTIVES AND APPROACH



The objective of this study is to determine if the 
Principal Discriminant Method (PDM), applied to discrete 
values of model output and derived parameters, can improve 
upon the forecasting of horizontal marine atmospheric
visibility when compared to the Preisendorfer natural
regression and maximum probability approaches. The PDM 
approach is outlined as follows: 

a. define visibility groups, categorized in a way which 
relates most closely to operational use at sea. 

b. develop and apply the Preisendorfer (1984) PDM to 
three North Atlantic Ocean physically homogeneous 
areas [Lowe, 1984b], using 15 May through 7 July 1983 
Navy Operational Global Atmospheric Prediction System 
(NOGAPS) predictor data. 

c. compare and contrast the individual results with those 
Preisendorfer statistical methodologies previously 
explored by Karl (1984) and Diunizio (1984a). 

d. based on a. to c. above, present an interim
recommendation for an optimal statistical approach
to forecasting horizontal visibility in the North
Atlantic Ocean as a function of prediction time and
homogeneous area.

III. DATA



A. VISIBILITY OBSERVATIONS AND SYNOPTIC CODE 

Horizontal visibility observations taken from seagoing 
platforms are reported as values of ten standardized World 
Meteorological Organization (WMO) synoptic weather codes 
(Appendix C). These codes range in value from 90, which
corresponds to visibility less than 50 m, to 99, which
corresponds to visibility equal to or greater than 50 km.
Human observational error and inexactness in measuring 
visibility at sea necessitate a reduction of visibility 
classification categories for prediction purposes. 

1. Three-Category Case

Initially, a three-category visibility
classification scheme was considered.

Visibility Category    Synoptic Code    Visibility Range

       I                  90-94         < 2 km
       II                 95-96         >= 2 km to < 10 km
       III                97-99         >= 10 km

The above scheme is the same as that used by Karl (1984)
and Diunizio (1984a); it is based upon the following at-sea
operational criteria followed by the U.S. Navy.

1. 10 km (5 n mi)--U.S. Navy aircraft carrier at-sea
flight recovery operations change from visual (VFR) to
controlled (IFR) approach guidelines [Department of the
Navy, 1979].

2. 2 km (1 n mi)--the sounding of reduced visibility
signals for all vessels operating in international waters.
The term "reduced visibility" is not specifically defined in
the International Regulations for Preventing Collisions at
Sea, 1972. The distance of 1 n mi is generally considered
to be the governing operational distance.

2. Two-Category Case

In the past [Renard and Thompson, 1984], forecasting 
skill for category II has proved to be minimal. In the 
preliminary work for this study, it was noted that the 
predictor means of all three category subsets, as a function 
of associated predictand values, were not always well 
separated. Without good separation, a good statistical 
forecast is not possible regardless of the method used. It
was noted, however, that even though not all three means were
well separated, at least two of the means were well 
separated from each other. This finding suggested that a 
two-category case might be better supported by the data. If 
the two-category case showed better data support than the 
three-category case, then enhanced results might be 
expected. To test this hypothesis, two different 
two-category data sets were created for experimentation. 

The two cases are:

Case X

Visibility Category    Synoptic Code    Visibility Range

       IX                 90-95         < 4 km
       IIX                96-99         >= 4 km

Case Y

Visibility Category    Synoptic Code    Visibility Range

       IY                 90-94         < 2 km
       IIY                95-99         >= 2 km
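
The three-category scheme and the two-category Cases X and Y amount to a lookup on the WMO synoptic visibility code. The following is a minimal sketch of that lookup; the names SCHEMES and visibility_category are illustrative assumptions and do not come from the thesis programs.

```python
# Synoptic visibility codes (90-99) grouped into the categories used in
# this study. The dictionary and function names are illustrative only.
SCHEMES = {
    "three": {"I": range(90, 95), "II": range(95, 97), "III": range(97, 100)},
    "X": {"IX": range(90, 96), "IIX": range(96, 100)},
    "Y": {"IY": range(90, 95), "IIY": range(95, 100)},
}


def visibility_category(synoptic_code, scheme="three"):
    """Return the category label for a WMO synoptic visibility code."""
    for label, codes in SCHEMES[scheme].items():
        if synoptic_code in codes:
            return label
    raise ValueError(f"not a visibility code in 90-99: {synoptic_code}")
```

For example, code 95 (2 km to less than 4 km) falls in category II of the three-category scheme, in IX under Case X and in IIY under Case Y.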



B. NORTH ATLANTIC OCEAN DATA 

1. Area

The North Atlantic Ocean, from 0° to 80° N latitude, 
was divided into homogeneous oceanic areas by Lowe (1984b), 
using a statistical cluster analysis technique. The 
homogeneous areas evaluated in this study are identified as 
areas 2, 3W and 4 which represent areas of moderate, 
frequent and sparse occurrences of poor visibility, 
respectively (Fig. 1). 

2. Time Period

Data from mid-May 1983 to mid-July 1983 were
combined to form a more extensive data set, hereafter
referred to as FATJUNE 1983. (NOGAPS predictor data,
however, were available to the study only for the period
15 May--7 July 1983.) The FATJUNE period was
selected as the initial data set for statistical
experimentation because of the climatologically high
frequency of occurrence of poor visibility observations for
many areas of the North Atlantic Ocean during this period.
Only the 1200 GMT synoptic ship report data, corresponding
to daylight conditions, were used in this preliminary study
of the method.

For the purpose of this study, TAU-00 generally 
represents six-hour model forecast fields. However, 
temperature, geopotential height and wind are model 
initialization fields. TAU-24 and TAU-48 are defined as 
24-h and 48-h model forecast fields. All of the above are
valid at 1200 GMT. TAU-00, TAU-24 and TAU-48 model output
parameters (predictors) are employed in the 00-h, 24-h and
48-h forecast schemes, respectively. Summaries of the number
of observations in each visibility category of the dependent
and independent data sets, as a function of homogeneous area
and prediction time for FATJUNE 1983, are contained in
Tables I-IV.

3. Synoptic Weather Reports

All synoptic visibility observations (predictand
data) for this study were provided by the Naval Oceanography
Command Detachment (NOCD), Asheville, North Carolina, which
is co-located with the National Climatic Data Center (NCDC).
The observations which contained systematic observer error 
or were obviously erroneous, as determined from the data 
quality indicators provided with the data, were deleted from 
the working data sets. 

4. Predictor Parameters

Fifty TAU-00, fifty-four TAU-24 and fifty-four
TAU-48 model output predictors (MOP's) were provided by the
Fleet Numerical Oceanography Center (FNOC), Monterey,
California. The parameters are generated by FNOC's current
operational atmospheric prediction model, NOGAPS. All MOP's
were interpolated from model grid coordinates to synoptic
ship report positions using a linear interpolation scheme.
In addition to the initial group of MOP's, thirteen derived
parameters representing calculated quantities, such as
parameter gradients and products, were included as potential
predictors. Of the available predictor parameters, fifteen
were eliminated from consideration because 1) the MOP lacked
a physical linkage to the visibility predictand, and/or 2) a
lack of significant digits (lost during the transfer of the
FNOC data to the main computer center mass storage system)
rendered the particular MOP useless. A list of all TAU-00,
TAU-24 and TAU-48 MOP's available to the experiments is
included in Appendix E.

C. DATA SETS 

1. Standardization

NOGAPS analysis/forecast parameters are output in a 
large variety of units/scales. To eliminate the effect of 
different units of the various predictors on the Principal 
Discriminant Method (particularly the part using principal 
component analysis), the data were standardized before the 
method was applied. Given $x_1, \ldots, x_n$ members in each
predictor group, the standardized members $y_1, \ldots, y_n$ are
given by

$$ y_j = \frac{x_j - \bar{x}}{s}, \qquad j = 1, \ldots, n, $$

where

$$ \bar{x} = \frac{1}{n} \sum_{j=1}^{n} x_j $$

is the mean, and

$$ s = \left[ \frac{1}{n-1} \sum_{j=1}^{n} \left( x_j - \bar{x} \right)^2 \right]^{1/2} $$

is the unbiased estimate of the
standard deviation.



In this way all units were removed, the data centered at 0, 
and the variance of each of the data sets became 1. 
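
The standardization step can be illustrated with a short sketch. It assumes each predictor is held as a plain list of numbers and uses the n-1 divisor for the standard deviation, as in the definition given; the function name standardize is an illustrative assumption.

```python
import statistics


def standardize(values):
    """Return (x_j - mean) / s for each x_j, with s the n-1 sample std."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # divides by n - 1
    if s == 0:
        raise ValueError("constant predictor cannot be standardized")
    return [(x - mean) / s for x in values]
```

After this transformation the values of a predictor have mean 0 and sample variance 1, whatever units the predictor was originally reported in.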

2. Dependent/Independent Data Sets

Since FATJUNE 1983 was the only data set available
for this study, the data were divided into two groups.
Approximately two-thirds of the data became the dependent
set upon which the model was based. This set is also
referred to as the training set. The remaining one-third of
the data became the independent set on which the model was
tested. This set is also referred to as the testing set.

To ensure that no biases existed in the sets, each
training-testing set pair was created by use of a uniform
random number generator. The given data sets were randomly
split and then checked to ensure they represented the
initial population mean within a 95% confidence interval.
Once created, these sets were used consistently throughout
all model runs.
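
A minimal sketch of such a split is given here. The exact form of the 95% check is not stated in the text; the sketch assumes a normal-approximation test on the subset mean (z = 1.96) and simply redraws the split until both subsets pass. All function names are illustrative.

```python
import math
import random
import statistics


def split_indices(n, train_fraction=2 / 3, rng=random):
    """Shuffle 0..n-1 and cut into training and testing index lists."""
    order = list(range(n))
    rng.shuffle(order)
    cut = round(n * train_fraction)
    return order[:cut], order[cut:]


def mean_is_representative(subset, population, z=1.96):
    """True if the subset mean lies within z standard errors of the population mean."""
    std_err = statistics.stdev(population) / math.sqrt(len(subset))
    return abs(statistics.mean(subset) - statistics.mean(population)) <= z * std_err


def make_split(values, seed=0, max_tries=1000):
    """Draw random 2/3-1/3 splits until both subsets pass the mean check."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        train, test = split_indices(len(values), rng=rng)
        if all(mean_is_representative([values[i] for i in idx], values)
               for idx in (train, test)):
            return train, test
    raise RuntimeError("no representative split found")
```

Fixing the seed mirrors the practice of creating each training-testing pair once and reusing it in every model run.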
IV. PROCEDURES



A. TERMS AND SYMBOLS 

The following terms and symbols are used throughout the 
remainder of this thesis and are briefly defined here to 
assist the reader. For more definitive mathematical 
expressions of potential errors, consult Appendix A. 
Mathematical expressions for class errors, threat scores and 
adjusted class errors may be found in Appendix D. 

1. A0--the estimated probability (based on actual
predictions using the testing set) of a zero-class
visibility category forecast error (e.g., if
visibility category I is forecast, it is also
observed).

2. A1--the estimated probability (based on actual
predictions using the testing set) of a one-class
visibility category forecast error (e.g., if
visibility category I is forecast and category
II is observed).

3. A2--the estimated probability (based on actual
predictions using the testing set) of a two-class
visibility category forecast error (e.g., if
visibility category I is forecast and category III
is observed).

4. PA0--the estimated probability (based on the
training set) of a zero-class visibility category
forecast error.

5. PA1--the estimated probability (based on the
training set) of a one-class visibility category
error. (PA2 is defined similarly.)

6. Potential skill scores--(PA0, PA1 above) may be
interpreted as follows. Randomly partition a data
set (such as FATJUNE 1983) many times into
training-testing set pairs. Fit probability
distributions to the category subsets of the
training set as described in PDM. Then produce
PA0, PA1 values (using the training set) and
actual A0, A1 values (using the testing set).
Repeat this for all the training-testing set pairs.
Take the average of all PA0 values and all A0
values. In the limit of a sufficiently large
number of partitions of the data set, these
averages will tend to agree. Similarly for PA1,
A1 and PA2, A2.

7. Correlation coefficient--a numerical measure of
the relationship between one predictor and another.
The value of the correlation coefficient ranges
from -1 for negative correlation to +1 for positive
correlation. The larger the absolute value of the
correlation coefficient, the more closely are the
predictors correlated.

8. P-value--the result of a two-sided significance
test on separate variance t-test statistics. This
gives a measure of the separation of the data into
different visibility categories.

9. TS1--threat score for visibility category I
computed from a contingency table.

10. Maximum probability strategy--choosing forecast
visibility category based upon the highest
conditional probability of the predictand
categories for a given predictor interval.

a. MAXPROB I--designation of a maximum probability
strategy in which ties of the highest conditional
probabilities in a predictor interval are resolved
by the generation of a random number.

b. MAXPROB II--designation of a maximum probability
strategy in which ties of the highest conditional
probabilities for a given predictor interval are
resolved by assigning the lowest visibility
category, of those tied, as the forecast category.

11. Natural regression strategy--choosing forecast
visibility categories based upon the statistical
average of the conditional probabilities of
visibility for a given predictor interval.

12. Functional dependence--a measure of the
stochastic dependence of one predictor upon another.
Functional dependence is an estimate of the
probability that one of the predictors will change
when the other changes. A high functional dependence
value between an already selected predictor and
another potential predictor indicates that the
potential predictor can add little information
beyond that of the selected predictor. The specific
derivation and mathematical description of the concept
of "functional dependence" are discussed in greater
depth by Preisendorfer (1983c).

13. Root-sum-squared functional dependence--the
functional dependence of a predictor on all
predictors already included in the developmental
model. It is equal to the square root of the sum
of the squares of the individual functional
dependence values.

14. AA0--adjusted A0. A contingency table statistic
which removes the influence of the most frequent
visibility category in a set of data (similar to
a normalized value).

15. CE--class error parameter defined as A0 + 2A1
used as the primary aid in identifying the first
predictor in the Preisendorfer (1983a,b,c) PR models.

16. PP--the potential predictability of visibility
by any given predictor.
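
To make the class-error and threat-score terms concrete, the sketch below computes A0, A1, A2 and a per-category threat score from a forecast-versus-observed contingency table. The formal expressions are in Appendix D, which is not reproduced here, so the threat score is taken to be the standard one (correct forecasts divided by the sum of forecasts and observations of the category, less the correct forecasts). The function names and table layout are assumptions.

```python
def class_error_probabilities(table):
    """Return [A0, A1, ...] from table[f][o], the count of cases
    forecast in category index f and observed in category index o."""
    total = sum(sum(row) for row in table)
    k = len(table)
    probs = [0.0] * k
    for f in range(k):
        for o in range(k):
            # |f - o| is the number of classes by which the forecast misses.
            probs[abs(f - o)] += table[f][o] / total
    return probs


def threat_score(table, cat):
    """Standard threat score for category index cat (0 = category I)."""
    hits = table[cat][cat]
    forecast = sum(table[cat])
    observed = sum(row[cat] for row in table)
    denom = forecast + observed - hits
    return hits / denom if denom else 0.0
```

With a three-category table, class_error_probabilities returns A0, A1 and A2 in that order, and threat_score(table, 0) corresponds to TS1.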



B. COMPUTER PROGRAMS 

Four computer programs were developed to test the
proposed Preisendorfer (1984) Principal Discriminant Method
(PDM). The programs are on file in the
Department of Meteorology, Naval Postgraduate School,
Monterey, California 93943.

1. A program to standardize the data and create
training and testing sets for homogeneous areas,
depending on whether the two- or three-category
strategy was in use.

2. A program to compute correlation coefficients between
chosen and unchosen predictors, sorting them from
low to high values.

3. A program to compute PA0, PA1, A0 and A1 values
for each predictor and to check the PA0 values for
significance against chance.

4. A program to compute PA0, PA1, A0 and A1 values
for two or more predictors using binary decomposition.
This program also computes contingency tables
and threat scores.

C. PREISENDORFER PDM METHODOLOGY 

1. Determination of the First Predictor

Selecting the first predictor is a two-step process.
The first step involves computing the initial statistics
(PA0, PA1) for each predictor. Secondly, based on output
from BMDP Statistical Software program P7D [University of
California, 1983], the average P-value for each predictor is
computed and these values are ranked from low to high. The
low values indicate better separability of the category
populations. Therefore, the first predictor chosen is the
one with the smallest averaged P-value. If more than one
predictor shares the same low P-value, then of those
predictors, the one with the highest PA0 value is selected
as the first predictor.
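
The ranking rule can be written compactly. The sketch assumes the averaged P-value and PA0 of each predictor have already been computed and are held in a dictionary; the function name and the data layout are illustrative.

```python
def choose_first_predictor(stats):
    """Pick the first predictor.

    stats maps predictor name -> (averaged P-value, PA0).
    """
    # Smallest averaged P-value wins; ties are broken by the highest PA0.
    return min(stats, key=lambda name: (stats[name][0], -stats[name][1]))
```

Sorting on the pair (P-value, negative PA0) applies both steps of the rule in a single pass.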

2. Choosing the Second and Subsequent Predictors

The prospective second predictor in the model is
determined from its correlation coefficient with the already
chosen first predictor: it is the predictor with the
smallest absolute value of the correlation
coefficient. Whether it will ultimately be chosen as the
second predictor depends on the following:

a. PA0 has increased, and

b. PA1 has decreased or remained constant, and

c. the averaged P-value is significant, i.e., less
than 0.05.

If the prospective predictor cannot meet these criteria,
then the next least correlated predictor is tried until all
predictors have been exhausted.

This process is repeated for the multi-predictor
stage until the model is complete.
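
The selection loop can be sketched as follows. The two callables, correlation and evaluate, are hypothetical stand-ins for the thesis programs that compute correlation coefficients and the PA0, PA1 and averaged P-value statistics. The text does not say which chosen predictor the correlation is measured against beyond the second level; the sketch assumes the most recently chosen one.

```python
def select_predictors(first, candidates, correlation, evaluate, alpha=0.05):
    """Greedy predictor selection under the three acceptance criteria.

    correlation(a, b) -> correlation coefficient between two predictors
    evaluate(chosen)  -> (PA0, PA1, averaged P-value) for a predictor list
    """
    chosen = [first]
    pa0, pa1, _ = evaluate(chosen)
    remaining = [c for c in candidates if c != first]
    added = True
    while added and remaining:
        added = False
        # Try the least correlated candidate first (assumption: correlation
        # is measured against the most recently chosen predictor).
        ranked = sorted(remaining, key=lambda c: abs(correlation(chosen[-1], c)))
        for cand in ranked:
            new_pa0, new_pa1, p_value = evaluate(chosen + [cand])
            if new_pa0 > pa0 and new_pa1 <= pa1 and p_value < alpha:
                chosen.append(cand)
                remaining.remove(cand)
                pa0, pa1 = new_pa0, new_pa1
                added = True
                break
    return chosen
```

The loop stops when no remaining candidate satisfies all three criteria or when the candidates are exhausted.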

3. Terminating the Selection of Predictors

Model development continues until any one of the
following conditions is met:

a. no more predictors remain to be considered, or

b. PA0 and/or P-values are no longer significant with
respect to the null hypothesis, or

c. criteria required to add additional predictors
cannot be met.

Once the model development is complete, actual
zero- and one-class errors (A0, A1) are computed using the
independent data set. The resulting PA0, PA1, A0 and A1
values provide the measurement statistics on which the
usefulness of the model is based.

D. PREISENDORFER (PR) MODEL 

This model represents the application of the
Preisendorfer (1983a,b,c) methodology (PR) explored by
Karl (1984) and Diunizio (1984a). Karl's study provides
specific details on the method and readers interested in a
more thorough presentation may consult it. This discussion
is presented as a prelude to comparing results of the PR
model to the Preisendorfer (1984) PDM model of this study.

As with the PDM model, the PR model utilizes NOGAPS
model output and derived parameters as potential predictors
in constructing a developmental model, based upon the
dependent training data set, which provides the structure by
which the model is tested and evaluated. (However, as
applied by Karl and Diunizio, the data sets were not formed
randomly nor were the means of the sets constrained to be
representative of the entire population from which they were
drawn. Instead, the visibility category groups were
constrained to show similar percentages, for both the
independent and dependent data sets.) The range of values
of these predictors is partitioned into discretized equally
populous predictor intervals ("cells") and conditional
probabilities of the predictand are calculated according to
the three previously defined visibility categories
(VISCAT's). There are three
separate strategies for determining the VISCAT to be
identified with each predictor value. These strategies are
MAXPROB I and MAXPROB II based on maximum probability, and a
natural regression approach.
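
The cell construction and the two maximum probability strategies can be sketched as below for a single predictor. This is an illustration under assumptions, not the PR code itself: cut points are taken from the sorted training values, conditional probabilities are simple relative frequencies, and all function names are hypothetical. The natural regression strategy is not sketched.

```python
import random
from collections import Counter


def equal_population_edges(values, n_cells):
    """Return n_cells - 1 cut points giving roughly equally populous cells."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(k * n) // n_cells] for k in range(1, n_cells)]


def cell_index(value, edges):
    """Index of the cell (0 .. len(edges)) that a predictor value falls in."""
    return sum(value >= e for e in edges)


def conditional_probabilities(values, categories, edges, n_categories=3):
    """Estimate P(category | cell) by relative frequency within each cell."""
    counts = [Counter() for _ in range(len(edges) + 1)]
    for v, c in zip(values, categories):
        counts[cell_index(v, edges)][c] += 1
    probs = []
    for cell in counts:
        total = sum(cell.values())
        probs.append([cell[c] / total if total else 0.0
                      for c in range(1, n_categories + 1)])
    return probs


def maxprob(probs_for_cell, strategy="II", rng=random):
    """Forecast category (1-based) for one cell.

    Strategy "I" breaks ties at random; strategy "II" assigns the lowest
    visibility category among those tied.
    """
    best = max(probs_for_cell)
    tied = [i + 1 for i, p in enumerate(probs_for_cell) if p == best]
    if strategy == "I":
        return rng.choice(tied)
    return min(tied)
```

Category 1 stands for VISCAT I, the lowest visibility, so taking the minimum of the tied categories implements the MAXPROB II rule.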

The sizes of the equally populous predictor intervals
are varied from four to ten. An optimal first predictor is
selected, which meets (in order) one of the following
requirements:

a. the lowest CE value of all the potential predictors,
or

b. the highest PP value of all potential predictors.

After selecting a first predictor for each of the
equally populous intervals, the corresponding VISCAT I, II
and III threat and A0 scores are calculated for both
dependent and independent data sets from the MAXPROB II
strategy. Then the optimal equally populous predictor
interval is selected such that it is the smallest interval
to maximize the dependent data set's adjusted A0 and
independent data set's adjusted VISCAT I threat score
(Appendix D).

Next, a functional dependence test of the first
predictor against the remaining potential predictors is run.
Subsequent predictors are selected only if:

a. the A0 value increases over that at the preceding
level, and

b. the selected predictor has the lowest
functional dependence and root-sum-square functional
dependence of all the remaining potential predictors.

After completing the predictor selection stage, Monte
Carlo significance testing is performed to see if the
results are significant compared to random chance.
Functional dependence/root-sum-square functional dependence,
A0 and A1 statistics are calculated for 100 randomly
generated sets to determine the 5 and 96 percentile points
of A1 and A0, denoted as 'A1(05)' and 'A0(96)', respectively.
The developmental model results are considered to be
significant if:

a. A0 is greater than or equal to A0(96), and

b. A1 is less than or equal to A1(05), and

c. the functional dependence value for a selected
predictor is less than the functional dependence 96
percentile level FD(96) (determined by the Monte
Carlo procedure, above).
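
Criteria a. and b. can be illustrated with a small Monte Carlo sketch. How the 100 random sets were generated is not described in this section; the sketch assumes they are formed by randomly shuffling the observed categories against the forecasts, and it uses a nearest-rank percentile. The functional dependence criterion c. is omitted, and all names are hypothetical.

```python
import math
import random


def class_error_rates(forecast, observed):
    """Return (A0, A1): fractions of zero-class and one-class errors."""
    n = len(forecast)
    a0 = sum(f == o for f, o in zip(forecast, observed)) / n
    a1 = sum(abs(f - o) == 1 for f, o in zip(forecast, observed)) / n
    return a0, a1


def nearest_rank(samples, pct):
    """Nearest-rank percentile point of a list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def monte_carlo_thresholds(forecast, observed, n_sets=100, seed=0):
    """Return (A0(96), A1(05)) from randomly shuffled observation sets."""
    rng = random.Random(seed)
    shuffled = list(observed)
    a0_samples, a1_samples = [], []
    for _ in range(n_sets):
        rng.shuffle(shuffled)  # break any forecast/observation link
        a0, a1 = class_error_rates(forecast, shuffled)
        a0_samples.append(a0)
        a1_samples.append(a1)
    return nearest_rank(a0_samples, 96), nearest_rank(a1_samples, 5)


def is_significant(a0, a1, a0_96, a1_05):
    """Criteria a. and b.: A0 >= A0(96) and A1 <= A1(05)."""
    return a0 >= a0_96 and a1 <= a1_05
```

A model whose A0 and A1 cannot be distinguished from those of the shuffled sets fails the test, whatever its nominal scores.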

Model development continues until the fifth predictor
level, at which point computer storage limitations preclude
further addition of predictors. Once complete, contingency tables
of forecast versus observed visibility category are 
constructed for both dependent and independent data sets. 
Threat and skill scores are computed and compared. 

E. THE PDM VS. PR METHODS 

The PDM method and the PR methods can be shown to be 
equivalent in the discrete setting; it is in the 
non-discrete setting that they differ by virtue of fitting 
one or the other with analytic versions of discrete 
probabilities. The MAXPROB approaches make a prediction 
based on the probability distribution of the categories for 
a given predictor value, whereas the PDM method 
discriminates between the probabilities of the categories in 
a predictor space. In the PDM method, analytic functions 
are fitted to the category subsets of predictor space and 
comparisons are made between these probabilities at each 
given predictor value. Thus, more continuous information is
available when the data are sparse in this method than with
the MAXPROB approach (although the latter, too, may be
fitted with analytic probability models). Both of these
methods should have an advantage over more traditional
linear regression techniques whenever the data show
nonlinear rather than linear trends over the predictor
space, since these methods would tend to follow the curve of
the data, instead of trying to fit a straight line (or
hyperplane) to them. The more predictor categories that are
used and the more nonlinear the predictand/predictor
relation, the greater is the anticipated advantage of PDM
over linear regression.
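
The idea of fitting analytic functions to the category subsets and comparing them at a given predictor value can be illustrated in one dimension. The form of the fitted functions in the PDM is given in Appendix A and is not reproduced here; the sketch below substitutes a Gaussian density weighted by category frequency purely as an assumption, and it is not the Preisendorfer (1984) algorithm itself.

```python
import math
import statistics


def fit_gaussians(values, categories):
    """Fit (mean, std, prior) to the predictor values of each category."""
    n = len(values)
    params = {}
    for cat in sorted(set(categories)):
        subset = [v for v, c in zip(values, categories) if c == cat]
        params[cat] = (statistics.mean(subset),
                       statistics.stdev(subset),
                       len(subset) / n)
    return params


def discriminate(x, params):
    """Return the category whose prior-weighted density is largest at x."""
    def score(cat):
        mu, sigma, prior = params[cat]
        density = (math.exp(-0.5 * ((x - mu) / sigma) ** 2)
                   / (sigma * math.sqrt(2 * math.pi)))
        return prior * density
    return max(sorted(params), key=score)
```

Because the fitted curves are continuous, a forecast is available at any predictor value, including values where few or no training observations fall.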
V. RESULTS



The results of the Principal Discriminant Method (PDM)
experiments, as outlined in Chapter IV and Appendix A, are
presented herein. They are arranged by oceanic homogeneous
area and model output period. Fig. 1 displays the
individual oceanic homogeneous areas for FATJUNE 1983.
Tables I through IV identify the number of observations in
each visibility category by prediction interval (i.e., TAU)
and homogeneous area.

The results are further clarified by the corresponding
figures in Appendix G, which provide comparisons of PA0 and
dependent and independent A0 scores versus the number of
predictors chosen for that particular data set. The models
for each set terminated due to established model constraints
and not due to computer system storage restrictions. Note
that dependent A0 scores were not available at the first
predictor level due to programming and time constraints.
Future experiments could include this information. Thus,
the dependent A0 data start with the second predictor. The
chosen predictors are listed in the order of selection.
Contingency tables resulting after the selection of the
final predictor are included for both dependent and
independent sets. In general, independent A0 (testing)
scores are lower than the dependent (training) A0 scores.
Even though the training and testing sets are representative
of the same population, their points are scattered
differently. This difference, in general, leads to a
decrease in the A0 scores from the dependent (training) set
to the independent (testing) set. However, in a number of
cases, the independent A0 score is higher than the PA0 score
at the first predictor level. Likewise, the dependent A0
score is higher than the PA0 score at the second predictor
level for some cases. Although, on average, one would
expect the reverse to be true, the scatter of the individual
test scores could occasionally lead to higher A0 scores than
PA0 scores. The steady decline of A0 scores for the first
few predictors is also a common occurrence. While the PA0
score continues its steady ascent (as required by the method
to justify the addition of the next predictor), the A0 scores
show more erratic behavior, exhibiting the instability of
the method. However, when the criteria test was changed, as
will be described later, the resulting A0 values show a
closer relationship to the PA0 scores and hence greater
stability. Stability is desired in a model or else its
forecasts are of little value. To determine exactly why the
method is stable or unstable, carefully controlled
experiments would have to be performed with artificial data
sets.

When comparing the results of the PDM model to the
maximum probability and natural regression strategies of the
PR models, it was noted that the PR models provided higher
scores in almost every case for all scoring techniques.
This difference may be due to the composition of the data
sets themselves, the separation of the data into
training-testing pairs, some aspect of the methodology or a
programming error. However, without conducting experiments
on carefully constructed artificial data sets it would be
impossible at this point to state a conclusive reason for
the difference. One PDM experiment which was conducted at
the end of the research did give comparable results to the
PR methods and will be discussed in more detail later as
will any other exceptions to the general finding stated
above. Specific numerical values from the work of Karl
(1984) and Diunizio (1984a), along with the corresponding
PDM results, are presented in Table V.

A. NORTH ATLANTIC OCEAN, AREA 2 

Area 2 encompasses a geographic region extending from 
the southeastern tip of Newfoundland, across the North 
Atlantic Ocean to the eastern coast of England, 
north/northeast to include most of Iceland, and back to the 
Canadian coast north of Newfoundland (Fig. 1). 

1. Area 2, TAU-00

Results for this case are shown in Fig. 6. Five
predictors were selected. The dependent A0 score rises
slightly between predictors two and three, and then roughly
parallels the independent A0 score. The independent A0
score does not show an increase over its initial value until
the addition of the fifth predictor. The PDM model
outperforms the PR model in the following scores (Table V):

a. TS2 scores for both dependent and independent sets
are higher than for either MAXPROB I or MAXPROB II,
thus showing better skill in forecasting VISCAT II.

b. The TS12 score is higher for the dependent set
than either MAXPROB I or MAXPROB II.

2. Area 2, TAU-24

A variety of experiments were performed on this
case. In addition to the standard application of the PDM
techniques afforded the other cases, two additional
three-category experiments and a two-category experiment
were performed, as detailed in paragraphs a, b, and c below.

a. Set Composition Experiments

To determine the effect of the random
composition of training/testing set pairs on the results,
three distinct sets (2(A,B,C)) were created.

Sets 2(A,B,C) follow a similar pattern for the
PA0 scores, except that sets 2(B) and 2(C) could not support
more than five predictors, while the model for set 2(A)
finally terminated with the seventh predictor (Fig. 7). The
first five predictors were the same in all three cases. The
dependent A0 scores (Fig. 8) follow a different pattern in
each case (one (A) declining to predictor four and then
increasing and decreasing once more, one (B) declining
steadily, and one (C) declining to predictor three and then
increasing and decreasing once more) which is fairly well
paralleled by the independent A0 scores (Fig. 9). Ideally,
the curves in Figs. 8 and 9 should be as closely spaced as
those in Fig. 7. Presumably, these scattered curves are
giving information about the noise inherent in the
observed visibility data sets. Also, they may indicate that
PA0, PA1 must be redefined so that they may more
realistically anticipate these scatterings of the A0, A1
scores. Separate figures for each set are found in Figs.
10, 11 and 12.

The PDM vs. PR results found for area 2, TAU-00
hold true at TAU-24 also.

b. Criteria Experiments 

To determine the effect of altering the criteria
for splitting data swarms in predictor space during the
decomposition phase, two methods were tried. The first
entailed changing the critical A value from A(96) to A(98).
The second eliminated Monte Carlo methods entirely and
created a new value, A', where A' is the ratio of the
largest eigenvalue (associated with the data swarm's
covariance matrix) to the average of the remaining
eigenvalues. For the A' experiments, the set was split
if A'>2.
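The A' test described above can be sketched in a few lines. This is only an illustration, assuming the eigenvalues of the swarm's covariance matrix have already been computed; the function names `a_prime` and `should_split` are hypothetical and do not come from the thesis code, and the default threshold of 2 is the value quoted in the text.

```python
def a_prime(eigenvalues):
    """Ratio of the largest eigenvalue to the mean of the remaining ones."""
    ordered = sorted(eigenvalues, reverse=True)
    if len(ordered) < 2:
        raise ValueError("need at least two eigenvalues")
    rest = ordered[1:]
    mean_rest = sum(rest) / len(rest)
    if mean_rest == 0.0:
        # All variance lies along one direction (assumed handling).
        return float("inf")
    return ordered[0] / mean_rest


def should_split(eigenvalues, threshold=2.0):
    """Split the swarm when A' exceeds the threshold."""
    return a_prime(eigenvalues) > threshold
```

For example, eigenvalues of 4, 1 and 1 give A' = 4 and the swarm would be split, while eigenvalues of 1.5 and 1 give A' = 1.5 and the swarm would be left intact.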

The criteria tests, which were all performed on set
2(A), show that changing the critical value from A(96)
to A(98) leaves the PA0 pattern basically unchanged. (Note
that the same seven predictors were used in each of the
criteria experiments.) The pattern for the A' curve is much
different, exhibiting a slower rise to the sixth predictor
and then a large jump at the end, surpassing the results of
the other criteria tests (Fig. 13). The major difference is
in the behavior of the dependent and independent A0 scores
(Figs. 14 and 15). The A(96) test curve in Fig. 14 shows a
sharp decline in the dependent A0 score to the fourth
predictor, an even sharper rise at the fifth predictor and
a decline thereafter. These scores are mirrored by the
lower-scoring independent A0's in Fig. 15. Both sets of
scores are considerably less than the PA0 scores of Fig. 13.
The A(98) test gives a more stable version of the same
pattern. Unlike the first two tests, the A' test produces
dependent A0 scores in Fig. 14 quite similar to the PA0
scores of Fig. 13 through predictor six, with independent
scores following a roughly similar pattern without a major
loss in zero-error skill. These results show much greater
stability than those of any other experiment conducted and
thus show the most promise for continued investigation of
the PDM method. The dependent and independent A0 scores for
the A' version of PDM are comparable to those of the PR
methods. Some of the skill scores are higher for PDM and
some for PR. The point here is that "fine tuning" the
criterion cut-off (from A'>2.0 to some other value) could
result in superior scores overall. The A(98) and A' cases
are treated individually in Figs. 16 and 17. Once again, in
comparing the curves in Figs. 16 and 17, we see that the A'
version of PDM produces much more stable and somewhat higher
scores than the A(98) version.

c. Two-Category Experiments 

The two-category cases (Chapter III.A.2) provide
quite different final results when compared with each other,
even though the general pattern was not much different
between Case X and Case Y(A). (Note that there are three
versions of Case Y, i.e., Y(A,B,C). All comparisons between
Cases X and Y were done with the Y(A) data set.) The
results for Case X are shown in Fig. 18. The model
terminated at the fifth predictor level with the same five
predictors as for the other Area 2, TAU-24 cases. Both
dependent and independent A0 scores decline through
predictor number three, then slowly increase to a level much
below their respective initial A0 scores.

The three Case Y sets exhibit the same
similarities in the PA0 scores as the three-category cases
(Fig. 19). Again, the A0 scores (Figs. 20 and 21) tend to
show a pattern of decline followed by increasing scores.
However, in these two-category cases, the independent A0
scores are very close to the dependent A0 scores and they
are considerably higher than for the three-category
cases. Individual results for each case are presented in
Figs. 22, 23 and 24.

Case X does not show A0 scores appreciably
higher than the A0 scores from the three-category case.
This might indicate that the data do not support the Case X
VISCAT divisions any more than for the three-category case.
However, Case Y does show significantly higher A0 scores
than the three-category case. This is more in line with
expectations, since it ought to be easier to forecast for
two categories than for three under any circumstances. This
result seems to indicate that the Case Y VISCAT divisions
are supported by the available data, i.e., that it may be
more feasible for on-board observers to discriminate between
visibilities less than and greater than 2 km than between
visibilities less than and greater than 4 km.

3. Area 2, TAU-48 

Results for this case are shown in Fig. 25. Four
predictors were selected. The dependent A0 scores stay
virtually constant for all predictors. The independent A0
scores show a continuous decline with each additional
predictor. The PDM shows higher dependent threat scores and
a higher independent TS2 score than MAXPROB I.

B. NORTH ATLANTIC OCEAN, AREA 3W 

Area 3W borders the United States' eastern seaboard from
the vicinity of Cape Charles, Virginia to the southeastern
tip of Newfoundland. The area encompasses a large portion
of the Georges Banks region and extends to approximately 45°
W longitude (Fig. 1).

1. Area 3W, TAU-00

Results for this case are shown in Fig. 26. Five
predictors were chosen. Once again, both dependent and
independent A0 scores decline until the addition of the
fifth predictor, at which point they surpass their initial
values.

2. Area 3W, TAU-24

The results for this case are shown in Fig. 27.
Three predictors were selected. In this case, the
independent A0 score increases with the addition of each
predictor, while the dependent A0 score decreases. This
pattern is not seen in any other case.

3. Area 3W, TAU-48 

The results for this case are shown in Fig. 28.
Seven predictors were chosen. The dependent and independent
A0 scores decline until the addition of the fifth predictor,
where they reach their maxima. The sixth predictor shows
another decline and the seventh an increase. The
independent TS2 score in the PDM model equals that of the PR
model.




C. NORTH ATLANTIC OCEAN, AREA 4 

Area 4 encompasses a broad region of the North Atlantic
Ocean which is generally to the south of area 2 and east and
southeast of area 3W. This area's southern border reaches
to the northeastern tip of Portugal and extends northward
through the English Channel to encompass the southern
portion of the North Sea (Fig. 1).

1. Area 4, TAU-00

The results for this case are shown in Fig. 29.
Only two predictors were chosen for this model. The
independent A0 score declines after the addition of the
second predictor. The dependent A0 score shows no trend
since only one value was available.

2. Area 4, TAU-24 

The results for this case are shown in Fig. 30.
Four predictors were chosen. The dependent and independent
A0 scores decline until the addition of the fourth
predictor, at which point they are larger than at the
initial predictor stage. The dependent and independent
results are almost identical, which is a rare result. The
PDM model scores higher than MAXPROB I on all dependent
threat scores and on the independent TS2 score.

3. Area 4, TAU-48 

The results for this case are shown in Fig. 31. Two
predictors were chosen. The independent A0 score increases
from the first to the second predictor. No trend is
available for the dependent A0 score.

VI. CONCLUSIONS AND RECOMMENDATIONS



A. CONCLUSIONS 

The primary objective of this study is to evaluate the
Principal Discriminant Method (PDM) [Preisendorfer, 1984],
to compare those results to the maximum probability and
natural regression schemes [Preisendorfer, 1983a,b,c]
examined by Karl (1984) and Diunizio (1984a), and to propose
a viable statistical forecasting scheme suitable for
eventual employment in an operational U.S. Navy marine
visibility MOS forecasting system. In general, the PDM
model, using the A(96) criterion for decomposing sets, was
outperformed in all measures of effectiveness by all of the
PR schemes.

However, the version of the PDM model which used the A'
criterion for splitting predictor category sets during
decomposition showed very promising results (cf. 2b in V.A.
and Table V). The A0/A1 scores for both dependent and
independent sets were very close to those of the PR models.
Perhaps this is because the A' criterion is a better judge
of the geometry of the data sets than the Monte Carlo A(96)
criterion. The result is that the information contained in
the data set is more readily available in the A' method than
in the other predictor space category splitting methods.

B. RECOMMENDATIONS



The following recommendations are offered to future
researchers:

1. The decision criteria for splitting data swarms
in the decomposition phase need further examination.
Indeed, it is the novel use of principal component
analysis for this purpose that distinguishes the
present discriminant method from other such methods
in the literature. The A' criterion appears to be
a step in the right direction (note 2b in V.A.).
Further research should center on determining the
best value against which to test the A value, or
still other ways of splitting the overly-elongated
category subsets of predictor space.

2. Create carefully controlled artificial data sets
on which to apply all of the Preisendorfer models
(MAXPROB, natural regression, PDM) to determine where
and why they break down or excel. Also, using the
same artificial data, simultaneously test regression,
especially linear, along with the various threshold
models.

3. Remove from further consideration the A(96) and A(98)
critical score criteria in the decomposition phase
of the model.

4. Test the PDM model, using the entire FATJUNE 1983
data set as the training set and the entire FATJUNE
1984 data set as the testing set, or vice versa.

5. Use winter data for a set of experiments to
determine if the results are similar to those of
the summer season.

6. Use a night-time data set (0000 GMT) in the
North Atlantic area to test the expected deterioration
of all schemes relative to daytime conditions.

APPENDIX A



A DISCUSSION OF THE STATISTICAL PROCEDURES PROPOSED BY 
PREISENDORFER (1984) FOR THE FORECASTING OF 
ATMOSPHERIC MARINE HORIZONTAL VISIBILITY USING 
MODEL OUTPUT STATISTICS 



I. INTRODUCTION 



The following discussion is based upon an unpublished
note by Preisendorfer (1984). The note develops the
Principal Discriminant Method (PDM) of forecasting and
suggests how to link numerical weather prediction model
output parameters with observed fields to produce model
output statistics (MOS) prediction schemes. The
application of his methodology to MOS forecasting is as
follows:

1. Generate suitably lagged predictand/predictor 
pairs of data. The predictors are drawn from the 
United States Navy Fleet Numerical Oceanography 
Center's Navy Operational Global Atmospheric 
Prediction System (NOGAPS) model output. The 
predictands are drawn from synoptic ship visibility 
observations provided by the Naval Oceanography 
Command Detachment, Asheville, North Carolina. 

2. Separate the predictand data into visibility 
categories. Construct predictand/predictor pairs 
based on the predictand visibility category values. 
Partition the space of predictor values into 
category subsets. 

3. Fit a probability density function to the category 
subsets of predictor space. This task is facilitated 
by using a succession of principal component analyses 
of the category sets in predictor space. 

4. Based on the probability density functions for the
training set, find the potential class errors, PA0,
PA1.

5. Based on the probability density functions and
utilizing testing set data, find the actual class
errors, A0, A1.

6. Pick as the first predictor the one with the smallest
averaged P-value (a measure of separation between
two probability density functions) and largest PA0
value.

7. Correlate a potential predictor with the set of
already selected predictors, selecting as the next
predictor the one which is least correlated with
the already-selected set.

8. Repeat steps 1-5 and 7 until all predictors are
chosen.



II. SINGLE PREDICTOR STAGE 

A. THE PREDICTOR/PREDICTAND PAIR*

For each individual data point I (I=1,NTRN, the number
of points in the training set) there is a predictand value
NTRPY(I) and its corresponding predictor values TRNPX(I,KX)
where KX=1,KP, the total number of predictors under
consideration. For this study, the NTRPY(I)'s represent
visibility while the TRNPX(I,KX) may be, e.g., the vapor
pressure at 925 mb, or surface moisture flux, etc.

*The notation herein follows that of the corresponding
computer code.

B. THE DISCRIMINANT SET

The discriminant diagram, Fig. 2a, for the data, shows
histograms indicating to which predictand category a given
predictor value is assigned. Thus the triangles are for
category 1, circles for category 2, squares for category 3.
In Fig. 2b, these histograms are separated vertically to
form the discriminant set. As the points in the data set
are considered (i.e., as I changes), the TRNPX(I,KX) value
moves irregularly about on the horizontal axis while the
corresponding NTRPY(I) moves similarly among the three
levels of the vertical axis, now occupying category 1, then
category 3, and so on. A point pair is shown in an
instantaneous position in Fig. 2b. There are NTRN such
pairs in the dependent (training) discriminant set and NTST
such pairs in the independent (testing) discriminant set.
The diagram in Fig. 2b stands for either of these two
discriminant sets.

C. CATEGORY SUBSETS OF PREDICTOR SPACE 

Looking at the discriminant set (Fig. 2b), notice the
subset of predictor points associated with category one.
This is XCAT1(I1,KX), the rightmost pairs of points on the
first level (which is simply a copy of the horizontal axis).
Similarly, XCAT2(I2,KX) contains the middle pairs and
XCAT3(I3,KX) the leftmost pairs. Each predictor point of
the training set is assigned to a predictand in a particular
category. Thus predictors corresponding to the training
predictand values in categories 1, 2 or 3 are assigned to
XCAT1, XCAT2 or XCAT3 respectively. I1, I2 and I3 represent
the index of values in the respective categories. KX
identifies the predictor (e.g., vapor pressure at 925 mb,
etc.).

D. FITTING THE PROBABILITY DENSITY FUNCTION 

For this study, the Gaussian probability density 
function (PDF) was chosen to be fitted to the category 
subsets of the predictor space. However, one might consider 
using other PDF's if they were more suitable for a given 
data set. 

The one-dimensional Gaussian PDF for category J is:

$$\mathrm{PHIJ}(X) = (2\pi)^{-1/2}\,(\mathrm{SIGJ})^{-1}\,\exp\!\left[-0.5\,(X-\mathrm{AVGJ})^{2}/\mathrm{VARJ}\right]$$

where J=1,2,3, and
AVGJ=average
SIGJ=standard deviation
VARJ=variance

of the set of points defined by XCATJ(IJ,KX), IJ=1,NXJ for
each predictor indexed by KX. The fitted curves may appear
as in Fig. 2c.
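The fitting step can be sketched as follows. This is a minimal illustration rather than the thesis code: the function names `fit_gaussian` and `phi` are hypothetical, and the use of the population variance (divisor NXJ rather than NXJ-1) is an assumption, since the text does not state which divisor was used.

```python
import math


def fit_gaussian(values):
    """Return (AVGJ, VARJ) for one category subset of a single predictor."""
    n = len(values)
    avg = sum(values) / n
    # Population variance (divisor n) is an assumption of this sketch.
    var = sum((v - avg) ** 2 for v in values) / n
    return avg, var


def phi(x, avg, var):
    """One-dimensional Gaussian PDF with mean avg and variance var, at x."""
    sig = math.sqrt(var)
    return (2.0 * math.pi) ** -0.5 / sig * math.exp(-0.5 * (x - avg) ** 2 / var)
```

One such (AVGJ, VARJ) pair would be fitted per category J and per predictor KX; a subset with zero variance cannot be fitted this way and would need separate handling.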

E. CLASS ERRORS 

An indication of how well a prediction method is doing
is to count the number of predictions that are correct
(zero-class errors) and the number of predictions that are
off by one category (one-class errors). This is done two
ways. The potential zero- and one-class errors, PA0 and PA1,
are determined using the probability functions fitted to the
category subsets of the training set. The actual zero- and
one-class errors, A0 and A1, are determined using the
testing set.

1. Finding the probabilities

Form the array

ANU(M,J)=PHIJ(TRNPX(M,KX)), KX fixed,

where J=1,2,3
TRNPX is the training set function which assigns
to (M,KX) a predictor value
M=1,...,NTRN the indexes of points in the training set
KX=1,...,KP the set of predictors' indexes

Let

$$\mathrm{SNU}(M) = \sum_{J=1}^{3} \mathrm{ANU}(M,J)$$

and define the probabilities:

PRB(M,J)=ANU(M,J)/SNU(M).
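The normalization above amounts to dividing each category density by their sum. A minimal sketch for a single point M follows; the function name `class_probabilities` is hypothetical, and raising an error when all densities vanish is an assumption, since the text does not say how that case is treated.

```python
def class_probabilities(densities):
    """Turn ANU(M, 1..3) for one point M into PRB(M, 1..3)."""
    total = sum(densities)  # SNU(M)
    if total == 0.0:
        # Assumed handling: no category density covers this point.
        raise ValueError("all category densities are zero at this point")
    return [d / total for d in densities]
```

The returned values sum to one, so they can be read as the probabilities of the three visibility categories given the predictor value.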

2. Finding PA0,PA1

Find the maximum of the set of probabilities
PRB(M,J), J=1,3. Let this be PRB(M,J(M)) for each
M=1,NTRN. For example, if of PRB(M,1), PRB(M,2) and PRB(M,3),
the maximum value occurs for PRB(M,2), then J(M)=2. In
practice, this would result in predicting category 2. Then
define

$$\mathrm{PA0} = \frac{1}{\mathrm{NTRN}} \sum_{M=1}^{\mathrm{NTRN}} \mathrm{PRB}(M,J(M))$$

$$\mathrm{PA1} = \frac{1}{\mathrm{NTRN}} \sum_{M=1}^{\mathrm{NTRN}} \left[\mathrm{APRB}(M,J(M)) + \mathrm{APRB}(M,J(M)+2)\right]$$

$$\mathrm{PA2} = 1 - (\mathrm{PA0} + \mathrm{PA1})$$

where APRB(M,1)=0
APRB(M,2)=PRB(M,1)
APRB(M,3)=PRB(M,2)
APRB(M,4)=PRB(M,3)
APRB(M,5)=0

The APRB arrays allow for easier calculation of PA1 since
array indexing does not allow for PRB(M,0) terms, for
example. Because APRB(M,J+1)=PRB(M,J), the two terms in the
PA1 sum are the probabilities of the categories adjacent to
the predicted category J(M).

The higher the PA0 values and the lower the PA1 values,
the better the predictor PX(I,KX) may potentially predict
PY(I). Therefore, a potentially good predictand-predictor
pair has large PA0 and small PA1 values.
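The potential class scores can be sketched directly from the probabilities. The function name `potential_class_scores` is hypothetical, and the list-of-rows input layout is an assumption made for illustration; the zero-padded copy plays the role of the APRB array.

```python
def potential_class_scores(prb):
    """prb: rows [PRB(M,1), PRB(M,2), PRB(M,3)]. Returns (PA0, PA1, PA2)."""
    n = len(prb)
    pa0 = 0.0
    pa1 = 0.0
    for row in prb:
        # Zero-padded copy of the row, i.e. APRB(M, 1..5).
        aprb = [0.0] + list(row) + [0.0]
        # 0-based index of the most probable category, J(M) - 1.
        j = max(range(len(row)), key=row.__getitem__)
        pa0 += row[j]
        # Probabilities of the categories just below and just above J(M).
        pa1 += aprb[j] + aprb[j + 2]
    pa0 /= n
    pa1 /= n
    return pa0, pa1, 1.0 - (pa0 + pa1)
```

With rows (0.7, 0.2, 0.1) and (0.2, 0.5, 0.3), the first point contributes 0.7 to PA0 and 0.2 to PA1, and the second contributes 0.5 and 0.5, giving PA0 = 0.6, PA1 = 0.35 and PA2 = 0.05.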

3 . Finding A0,A1 

A0 and A1 are the actual zero- and one-class errors
produced by the model when the predictor values of the
testing set are given to the previously established
probability density functions, i.e., the PHIJ's. Using the
same strategy as for PA0,PA1, make a prediction for the
predictand value and then compare it with the actual
predictand value from the testing set. With each correct
prediction or one-class error, the totals of A0 or A1
increase by one unit, respectively:

A0=(1/NTST)(TOTAL ZERO-CLASS ERRORS)

A1=(1/NTST)(TOTAL ONE-CLASS ERRORS)
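The counting of actual class errors can be sketched as below. The function name `actual_class_scores` is hypothetical, and categories are assumed to be coded as the integers 1, 2 and 3.

```python
def actual_class_scores(predicted, observed):
    """Return (A0, A1) from predicted and observed categories (1, 2 or 3)."""
    ntst = len(observed)
    pairs = list(zip(predicted, observed))
    zero_class = sum(1 for p, o in pairs if p == o)           # correct forecasts
    one_class = sum(1 for p, o in pairs if abs(p - o) == 1)   # off by one category
    return zero_class / ntst, one_class / ntst
```

For predictions (1, 2, 3, 3) against observations (1, 3, 1, 3), two forecasts are correct and one is off by one category, so A0 = 0.5 and A1 = 0.25.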




F. SCREENING AND RANKING CATEGORY SUBSETS 



1. Separability of Category Subsets

Unless the category subsets are well separated from
each other, the predictions will not have much skill.* As a
measure of separability, the P-statistic of each distinct
pair of categories is found for each predictor using BMDP
Statistical Software program P7D [University of California,
1983]. For the three-category case at hand, this provides
three P-values which are then averaged to provide a single
mean P-value for each predictor. These are then ranked
smallest to largest: the smaller the value the better
separated are the data heaps in the category subsets. The
first chosen predictor is thus the one with the smallest
mean P-value. (Other measures of separability exist. See,
e.g., the potential predictability (PP) measure in
Preisendorfer (1983a).)

2. PA0 scores

In the event that more than one predictor has the
smallest mean P-value, then the first predictor is chosen by
selecting from among those predictors the one with the
largest PA0 score.

*Unless category set separation is present, it is
unlikely that any method of prediction of the given
predictand from the given predictors will be skillful. It
is this feature of the prediction problem that discriminant
methods, such as the present one, isolate most clearly: if
category probability density curves are well separated and
the training set is representative of the data set, then
high forecast skill is assured.

III. MULTIPLE PREDICTOR STAGE



A. CORRELATIONAL SCREENING OF PREDICTORS 

Suppose we have K-1 predictors selected, where K=2,..,KP
and KP is the total number of predictors. Let the selected
predictors be TRNPX(I,KX), KX=1,..,K-1. So, if there is one
chosen predictor we have TRNPX(I,1), I=1,NTRN. Let the
remaining set of predictors be denoted TRNPW(I,KW),
KW=1,..,L where L=KP+1-K. Let CORR[KW,KX] denote the
correlation between the KXth chosen predictor and the KWth
unchosen predictor. Since the magnitude of the correlation
is an inverse measure of the distance between a chosen and
an unchosen predictor, we are looking for the smallest value
of the correlation, since a smaller value indicates that the
unchosen predictor is farther from (i.e., less dependent on)
a chosen predictor.

Therefore, let

$$C(KW) = \max_{KX=1,..,K-1} \left\{ \left| \mathrm{CORR}[KW,KX] \right| \right\}$$

This gives an (inverse) measure of the distance of the KWth
potential predictor from the set of chosen predictors. The
smaller C(KW) is, the larger the distance. The next
predictor chosen for consideration is the one with the
smallest C(KW) value.
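The screening rule can be sketched as follows. This is an illustration under stated assumptions: the function names `pearson` and `next_candidate` and the dictionary layout (predictor name mapped to its training values) are hypothetical, and the ordinary Pearson coefficient is assumed for CORR.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    # A constant series (sxx or syy equal to zero) raises ZeroDivisionError.
    return sxy / math.sqrt(sxx * syy)


def next_candidate(chosen, unchosen):
    """Return (name, C) of the unchosen predictor with the smallest C(KW).

    chosen, unchosen: dicts mapping a predictor name to its training values.
    """
    scores = {
        name: max(abs(pearson(w, x)) for x in chosen.values())
        for name, w in unchosen.items()
    }
    best = min(scores, key=scores.get)
    return best, scores[best]
```

Taking the maximum over the chosen predictors means a candidate is judged by its closest relationship to anything already in the model, so a candidate that duplicates even one chosen predictor is ranked last.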
B. THE K-DIMENSIONAL DISCRIMINANT SET



Once the Kth predictor is added, there are two sets, one
training and one testing, of K predictors in the form of a
vector

VECPX(I)=(PX(I,1),...,PX(I,K))

in the Euclidean K-space $E_K$. As index I changes, VECPX(I)
moves about in $E_K$, as does the predictand array NTRPY(I).
The set of all ordered pairs (VECPX(I),NTRPY(I)), I=1,NTRN
is the present discriminant training set and is a general
(K-stage) version of section II.B. above.
(VECTS(I),NTSPY(I)), I=1,NTST is the discriminant testing
set.



C. CATEGORY SUBSETS OF PREDICTOR SPACE 

As in II.C. above, category subsets of the K-dimensional
predictor training vector are formed, based on the value of
the associated predictand. The net result is three
subsets of $E_K$ defined by the three swarms of points

{XCATJ(I): I=1,...,NXJ}

where J=1,2,3 and NXJ is the number of points in the
subset. Here XCATJ(I)=$[\mathrm{XCATJ}(I,1),\dots,\mathrm{XCATJ}(I,K)]^{T}$. On
these three subsets of $E_K$ we fit the K-dimensional Gaussian
PDF's. However, these data swarms are not usually
distributed normally, which brings about the next step.

D. BINARY PRINCIPAL DECOMPOSITION OF THE CATEGORY SUBSETS

Let $X_J$ = {XCATJ(I): I=1,...,NXJ}, J=1,2,3, be the Jth
category subset of $E_K$. A general picture of $X_J$ for the case
K=2 is in Fig. 3. The shape of $X_J$ may possibly be elongated
and curvilinear. Since the most variance of the subset is
along the unit vector $e_1$ at 0, this suggests that we form a
principal component decomposition of the swarm $X_J$ of NXJ
points in $E_K$, where J=1,2,3. The principal component
decomposition is well-suited to find this direction $e_1$ of
greatest variance. This is done in the following steps.

1. Recall that in III.C., the data sets forming the
predictors were standardized. Next, go on to find the
centroid AMEANJ of each XCATJ in $E_K$. This is the centroid
point shown as 0 in Fig. 3. By definition,

$$\mathrm{AMEANJ} = \frac{1}{\mathrm{NXJ}} \sum_{I=1}^{\mathrm{NXJ}} \mathrm{XCATJ}(I)$$

The Lth component of AMEANJ is

$$\mathrm{AMEANJ}(L) = \frac{1}{\mathrm{NXJ}} \sum_{I=1}^{\mathrm{NXJ}} \mathrm{XCATJ}(I,L)$$

where L=1,...,K.

2. Form the covariance matrix SJ of the $X_J$ data swarm.
Thus, first center the points XCATJ(I) on the mean AMEANJ of
$X_J$:

$$\mathrm{XJ}(I) \equiv \mathrm{XCATJ}(I) - \mathrm{AMEANJ}, \quad I=1,\dots,\mathrm{NXJ}$$

i.e., in component form

$$\mathrm{XJ}(I,L) = \mathrm{XCATJ}(I,L) - \mathrm{AMEANJ}(L), \quad I=1,\dots,\mathrm{NXJ};\ L=1,\dots,K$$

Then the entry SJ(L,M) of SJ in its Lth row and Mth column
is

$$\mathrm{SJ}(L,M) = \frac{1}{\mathrm{NXJ}} \sum_{I=1}^{\mathrm{NXJ}} \mathrm{XJ}(I,L)\,\mathrm{XJ}(I,M)$$

for L,M=1,...,K and for categories J=1,2,3.

3. Find the eigenvalues and eigenvectors of the
covariance matrix SJ(L,M). Sort the eigenvalues from high to
low, and arrange their corresponding eigenvectors similarly.

4. Compute $A_1(I)$, the principal components, from

$$A_1(I) = \sum_{L=1}^{K} \mathrm{XJ}(I,L)\, e_1(L)$$

where $e_1 = [e_1(1),\dots,e_1(K)]^{T}$ is the eigenvector
corresponding to the largest eigenvalue of the data swarm
currently under consideration.
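Steps 1 through 4 can be sketched in plain Python. This is an illustration, not the thesis code: the function names are hypothetical, the covariance divisor NXJ follows the formula above, and the leading eigenvector is found by power iteration, which is a substitute chosen here for brevity (the text does not say which eigen-solver was used). Power iteration assumes the largest eigenvalue is distinct and that the start vector is not orthogonal to the leading eigenvector.

```python
import math


def centroid(points):
    """Component-wise mean (AMEANJ) of a list of K-component points."""
    n, k = len(points), len(points[0])
    return [sum(p[l] for p in points) / n for l in range(k)]


def covariance(points):
    """Return (SJ, centered points XJ), using the divisor NXJ."""
    n, k = len(points), len(points[0])
    mean = centroid(points)
    centered = [[p[l] - mean[l] for l in range(k)] for p in points]
    s = [[sum(c[l] * c[m] for c in centered) / n for m in range(k)]
         for l in range(k)]
    return s, centered


def leading_eigenvector(s, iterations=500):
    """Unit eigenvector of the largest eigenvalue, by power iteration."""
    k = len(s)
    e = [1.0 / (l + 1) for l in range(k)]  # arbitrary non-symmetric start
    for _ in range(iterations):
        v = [sum(s[l][m] * e[m] for m in range(k)) for l in range(k)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:
            raise ValueError("covariance matrix has no variance to follow")
        e = [x / norm for x in v]
    return e


def principal_components(points):
    """First principal component A_1(I) of every point in the swarm."""
    s, centered = covariance(points)
    e1 = leading_eigenvector(s)
    return [sum(c[l] * e1[l] for l in range(len(e1))) for c in centered]
```

For the three collinear points (-1,-1), (0,0) and (1,1), the leading direction is the diagonal and the principal components are -√2, 0 and √2. The overall sign of the eigenvector is arbitrary, so the signs of the components may flip with a different start vector.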

The following steps describe how the above information
is used in decomposing the data swarms to a terminal state
for use in the multi-predictor PDF's. See Fig. 3 for level
0 of the splitting procedure.

5. Decision to split subsets at level $l$:

a. For K predictors and a set $X_J(a_1,\dots,a_l)$ of $n_J$
points:

1) If $n_J \le K+1$, where K=number of chosen predictors,
set $T_J(a_1,\dots,a_l) = X_J(a_1,\dots,a_l)$. This set is terminal
because any further splits of the set will lead to
degeneracy. In fact, if $n_J < K$, set PHIJ=0 for this set since
in this case only trivial covariance matrices will be found.

2) If $n_J > K+1$, go to b.

b. Perform principal component analysis (PCA) of the
point swarm $X_J(a_1,\dots,a_l)$ in $E_K$. Determine the eigenvalues
$\ell_1,\dots,\ell_K$, where $\ell_1$ is the largest and $\ell_K$ is the
smallest. Let

$$\ell = \sum_{i=1}^{K} \ell_i .$$

Compute

$$A = \ell_1/(\ell - \ell_1).$$

Go to c.

c. Perform a Monte Carlo experiment to determine if A
is significantly large. That is, randomly generate a
duplicate of the data swarm under consideration, normalize
the data and find the centroid, covariance matrix and
eigenvalues. Let $\ell_1^{(i)} \ge \ell_2^{(i)} \ge \dots \ge \ell_K^{(i)}$ be the
ordered set of eigenvalues resulting from the ith Monte Carlo
experiment, i=1,...,100. Let

$$\ell^{(i)} = \sum_{j=1}^{K} \ell_j^{(i)}$$

and set

$$A(i) = \ell_1^{(i)}/(\ell^{(i)} - \ell_1^{(i)}).$$

Arrange the A(i) in ascending order, so that, after
relabeling, $A(1) \le A(2) \le \dots \le A(96) \le \dots \le A(100)$.

1) If $\ell_K = 0$, for any J=1,2,3 set PHIJ=0.

2) If $A < A(96)$, set $T_J(a_1,\dots,a_l) = X_J(a_1,\dots,a_l)$.
This is the terminal case. Go to the next swarm awaiting
decomposition.

3) If $A(96) < A$, go to d.

d. A split is performed by setting

$$X_J(a_1,\dots,a_l,0) = \{X(I) : X(I) \in X_J(a_1,\dots,a_l) \text{ and } A_1(I) \le 0\}$$

$$X_J(a_1,\dots,a_l,1) = \{X(I) : X(I) \in X_J(a_1,\dots,a_l) \text{ and } A_1(I) > 0\}$$

where X(I) is a point (K-tuple of numbers) in $E_K$. When
splits are completed on all levels, we have a set of
terminal nodes $T_J(a_1,\dots,a_l)$. See Fig. 4.
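The Monte Carlo decision in step 5 can be sketched for the two-predictor case. This is a sketch under several assumptions that are not in the text: it is restricted to K=2 so that the eigenvalues have a closed form, the random duplicate swarm is drawn as independent standard normal values of the same size (the text does not specify how the duplicate is generated), and all function names are hypothetical.

```python
import math
import random


def eigenvalues_2x2(s):
    """Eigenvalues (largest first) of a symmetric 2x2 matrix."""
    a, b, d = s[0][0], s[0][1], s[1][1]
    mid = 0.5 * (a + d)
    rad = math.sqrt((0.5 * (a - d)) ** 2 + b * b)
    return mid + rad, mid - rad


def split_statistic(eigs):
    """A = l_1 / (l - l_1), where l is the sum of all eigenvalues."""
    l1 = max(eigs)
    rest = sum(eigs) - l1
    return float("inf") if rest == 0.0 else l1 / rest


def covariance_2d(points):
    """2x2 covariance matrix of (x, y) points, using the divisor n."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return [[sxx, sxy], [sxy, syy]]


def critical_value(n_points, trials=100, rank=96, seed=0):
    """A(rank): the rank-th smallest statistic over random swarms."""
    rng = random.Random(seed)
    stats = []
    for _ in range(trials):
        # Assumed null model: independent standard normal coordinates.
        swarm = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
                 for _ in range(n_points)]
        stats.append(split_statistic(eigenvalues_2x2(covariance_2d(swarm))))
    stats.sort()
    return stats[rank - 1]


def split_swarm(points, scores):
    """Divide points by the sign of their first principal component."""
    low = [p for p, a in zip(points, scores) if a <= 0.0]
    high = [p for p, a in zip(points, scores) if a > 0.0]
    return low, high
```

A swarm of `n` points would be split when `split_statistic` of its own eigenvalues exceeds `critical_value(n)`; passing `rank=98` gives the A(98) variant. The fixed seed only makes the sketch reproducible.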

E. FITTING PROBABILITY DENSITY FUNCTIONS TO EACH TERMINAL 
NODE 



Denote the terminal nodes by '$T_J(I)$', which is the name
of the Ith terminal node found by successive splits of $X_J$ in
$E_K$. (Note: if the terminal node results from a degeneracy
or from the case $\ell_K = 0$, then no further work is done since
PHIJ=0 in those cases for all points.) Establish the
following notation:

AVGJ(I) = centroid of the terminal swarm of points $T_J(I)$
$C_J(I)$ = K x K covariance matrix of $T_J(I)$
$\mathrm{DET}_J(I)$ = determinant of $C_J(I)$, where

$$\mathrm{DET}_J(I) = \prod_{r=1}^{K} \ell_r, \quad \ell_r = \text{eigenvalues of } C_J(I).$$

Then the required probability density function is

$$\mathrm{PHIJ}(I,X) = (2\pi)^{-K/2}\,[\mathrm{DET}_J(I)]^{-1/2}\,\exp\!\left[-0.5\,(X-\mathrm{AVGJ}(I))^{T}\,C_J^{-1}(I)\,(X-\mathrm{AVGJ}(I))\right]$$

where
I runs over all terminal nodes associated with category
subset $X_J$
X is an arbitrary point in $E_K$
X - AVGJ(I) is a K-component column vector in $E_K$,
'T' denoting transpose
$C_J^{-1}(I)$ is the inverse of the covariance matrix $C_J(I)$.

This results in a set of three probability distributions
PHIJ(I,X), J=1,2,3 and forms the present model over each $X_J$
after suitably assembling these PHIJ(I,X) values.



F. ASSEMBLING THE PHIJ(I,X) ON EACH $X_J$

Let $n_J(I)$ be the number of points in $T_J(I)$. Then

$$\sum_{I=1}^{M_J} n_J(I) = \mathrm{NXJ},$$

the number of points in $X_J$, where $M_J$ is the number of
terminal nodes arising in $X_J$. Define

$$a_J(I) = n_J(I)/\mathrm{NXJ}.$$

Then $\sum_{I=1}^{M_J} a_J(I) = 1$. Set

$$\mathrm{PHIJ}(X) = \sum_{I=1}^{M_J} a_J(I)\,\mathrm{PHIJ}(I,X)$$

for J=1,2,3, X in $E_K$. This is the desired model.
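The assembly step is a weighted sum of the terminal-node densities, with weights proportional to node size. A minimal sketch follows; the function names and the choice to pass each node's density as a callable are assumptions made for illustration.

```python
def node_weights(node_sizes):
    """a_J(I) = n_J(I) / NXJ for each terminal node of one category."""
    total = sum(node_sizes)  # NXJ
    return [n / total for n in node_sizes]


def mixture_density(x, node_sizes, node_pdfs):
    """PHIJ(X): size-weighted sum of the terminal-node densities at x.

    node_pdfs: one callable per terminal node; a degenerate node can be
    represented by a callable that always returns 0.0.
    """
    weights = node_weights(node_sizes)
    return sum(w * pdf(x) for w, pdf in zip(weights, node_pdfs))
```

As an illustration, the four terminal sets found for XCAT1 in the worked example of Section IV contain 6, 3, 4 and 1 points, giving weights 6/14, 3/14, 4/14 and 1/14. The single-point node keeps its weight but contributes nothing, since its density is set to zero.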

G. CLASS ERRORS 

These are made from the new versions of PRB(M,J(M))
computed as in section II.E. above.

H. FINAL SCREENING TESTS FOR CANDIDATE PREDICTOR PX(I,K)

1. Using BMDP program P3D compute the P-value for each
of the three possible pairs of PDF's for the three
categories. Average these values and find P.

2. Compare the new PA0 and PA1 values with those found
for the previous run with one less predictor.

3. Compare the new PA0 to the null hypothesis.

Accept PX(I,K), the Kth candidate predictor, if each of
the following holds:

a. P<.05*

b. PA0(K-1)<PA0(K), PA1(K)<PA1(K-1)

c. PA0>PA0(null)

Here PA0(null) is the upper limit of the 95% confidence
interval, as found in Appendix B.

If these conditions are not fully met, return to section
III.A. and select the next potential predictor PW(I,KW) in
line, until all potential predictors have been considered.

Once the model is finished, that is, once all potential
predictors have been considered, compute the actual
A0(I) and A1(I) scores using the testing set.

*In the original version of PDM [Preisendorfer, 1984],
this step uses the potential predictability (PP) criterion.
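The three acceptance conditions of section H can be written as a single check. The function name `accept_predictor` and its argument names are hypothetical; the 0.05 level is the one stated in condition a.

```python
def accept_predictor(p_value, pa0_new, pa0_prev, pa1_new, pa1_prev,
                     pa0_null, alpha=0.05):
    """Return True when the candidate passes conditions a, b and c."""
    separated = p_value < alpha                             # condition a
    improved = pa0_prev < pa0_new and pa1_new < pa1_prev    # condition b
    beats_chance = pa0_new > pa0_null                       # condition c
    return separated and improved and beats_chance
```

With the values reported in the worked example of Section IV (P = 0, PA0 rising from .51 to .67, PA1 falling from .39 to .27, and PA0(null) = .40), the check returns True, which agrees with the acceptance of DEDP as the second predictor.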



IV. EXAMPLE 



An example is helpful in understanding how the method 
works in practice. The results presented here were obtained 
by applying the method to a set of 200 points taken from the 
Area 2, TAU-24 data set. The example will extend through 
one level of the multi-predictor stage, i.e., through the 
selection and acceptance of a second potential predictor. 

A. SELECTION OF FIRST PREDICTOR 

The first step towards identifying the first predictor
is to run the BMDP Statistical Software program P7D
[University of California, 1983] to find the average
P-value for each predictor. In this case, there were
several predictors for which the average P-value was 0.0.
Therefore (see note 2 of II.F.), the results showing the
PA0, PA1, A0 and A1 scores for each predictor (if used as
the first predictor) had to be consulted before the choice
was made. The chosen predictor was E850 because, of all the
predictors with an average P-value of 0.0, it had the
largest PA0 score (.51).

Predictor E850 was then correlated with the remaining
potential predictors. The potential second predictor chosen
was DEDP because it had the smallest correlation coefficient
when correlated with E850.

B. THE SECOND PREDICTOR STAGE 

With the first predictor chosen and a candidate second 
predictor ready for consideration, it was necessary to begin 
the principal component analysis (PCA) of the data swarm in 
anticipation of creating the probability density functions. 

When broken into the three categories corresponding to 
the visibility groupings, the categorized data sets 
contained the following numbers of points: 

XCAT1 — 14 
XCAT2— 14 
XCAT3--172 

The decomposition of the first category subset (XCAT1) 
will be explained in detail, since it is small. Fig. 5 
presents a pictorial representation of the following steps 
in the K=2 (predictor) stage: 

1. Consider XCAT1 first. For this swarm, λ > λ(96). 
Therefore, the swarm must be split using PCA. The two new 
sets are X_1(0) with 6 points and X_1(1) with 8 points. 

2. Consider the swarm X_1(0). Since λ < λ(96) in this 
case, the set is terminal, i.e., T_1 = X_1(0). No further 
decomposition is performed on this set. 

3. Consider the swarm X_1(1). Here, λ > λ(96), so the swarm 
is further decomposed into X_1(10) with 5 points and X_1(11) 
with 3 points. 

4. Next X_1(10) is considered. Since λ > λ(96), this swarm 
is further decomposed into X_1(100) with 4 points and X_1(101) 
with 1 point. 

5. Next X_1(11) is considered. Since this swarm has only 
3 values, and 3 ≤ K+1, this swarm is terminal. Thus, 
T_2 = X_1(11). 

6. The data swarm X_1(100) with 4 points is found to have 
λ < λ(96) and therefore, it is terminal. Set T_3 = X_1(100). 

7. The set X_1(101) has only 1 point. Since 1 ≤ K, this 
set is degenerate. Although T_4 = X_1(101), for this terminal 
set PHIJ = 0 for all values of X. Therefore, it is not 
considered when building the probability density functions. 
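
The bookkeeping in steps 1 through 7 can be replayed with a short Python sketch. It does not perform the PCA split or the λ test itself: the outcome of each comparison against λ(96) is entered by hand from the steps above, and step 6 is entered as not exceeding the threshold because that set is declared terminal. The size rules (terminal when n ≤ K+1, degenerate when n ≤ K) are an assumed reading of steps 5 and 7, and the names `classify_swarm` and `walkthrough` are hypothetical. Swarm labels are written as X_1(...).

```python
K = 2  # number of predictors at this stage


def classify_swarm(n_points, exceeds_threshold, k=K):
    """Return 'degenerate', 'terminal', or 'split' for one swarm.

    Assumed rules (a reading of steps 5 and 7, not a stated algorithm):
      n <= k      -> degenerate terminal set (PHIJ = 0 everywhere)
      n <= k + 1  -> terminal, too few points to split further
      otherwise   -> split only if lambda exceeds lambda(96)
    """
    if n_points <= k:
        return "degenerate"
    if n_points <= k + 1:
        return "terminal"
    return "split" if exceeds_threshold else "terminal"


# (label, number of points, lambda > lambda(96)?) copied from steps 1-7.
# None marks swarms where the size rule decides before the lambda test.
walkthrough = [
    ("X_1", 14, True),
    ("X_1(0)", 6, False),
    ("X_1(1)", 8, True),
    ("X_1(10)", 5, True),
    ("X_1(11)", 3, None),
    ("X_1(100)", 4, False),
    ("X_1(101)", 1, None),
]

results = {label: classify_swarm(n, bool(flag)) for label, n, flag in walkthrough}
usable_terminals = [label for label, kind in results.items() if kind == "terminal"]

for label, kind in results.items():
    print(f"{label:9s} {kind}")
print("usable terminal sets:", len(usable_terminals))
```

Under these assumptions the sketch reports three usable terminal sets and one degenerate set for XCAT1.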

Thus for XCAT1, there are three usable terminal sets. 
Similarly, there are two for XCAT2 and fourteen for XCAT3. 
Once the PHIJ's are formed and probabilities computed, 
potential class errors are computed and compared to the 
potential errors found at the one-predictor level. The new 
PA0 (.67) is greater than at the one-predictor level (.51) 
and the new PA1 (.27) is lower than at the one-predictor 
level (.39). With part of the selection criteria satisfied, 
the average P-value using both predictors was found using 
BMDP Statistical Software program P3D [University of 
California, 1983]. Since the average P-value (0) met the 
significance criterion of being less than .05, a second 
requirement towards acceptance of the second predictor was 
met. Since PA0 (.67) was greater than PA0(null) = .40, the 
third criterion was met and the second predictor DEDP was 
accepted. 
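
The three acceptance checks described above can be written as a minimal Python sketch. It assumes that all three must hold and that the first check is the potential zero-class score rising while the potential one-class error falls; the function name `accept_predictor` and its parameter names are hypothetical. The numbers passed in are the ones quoted in the paragraph above.

```python
def accept_predictor(pa0_new, pa1_new, pa0_old, pa1_old,
                     avg_p_value, pa0_null, alpha=0.05):
    """Apply the three checks from the text, assuming all must pass."""
    # 1. Potential scores improve relative to the previous predictor level.
    improves_scores = pa0_new > pa0_old and pa1_new < pa1_old
    # 2. Average P-value is below the significance level.
    is_significant = avg_p_value < alpha
    # 3. Potential zero-class score exceeds the null-hypothesis value.
    beats_null = pa0_new > pa0_null
    return improves_scores and is_significant and beats_null


# Values quoted for the second predictor (DEDP) in the example above.
accepted = accept_predictor(pa0_new=0.67, pa1_new=0.27,
                            pa0_old=0.51, pa1_old=0.39,
                            avg_p_value=0.0, pa0_null=0.40)
print(accepted)
```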
APPENDIX B 



NULL HYPOTHESIS SIGNIFICANCE TESTING 



Following the work of Diunizio (1984a), Mr. Paul Lowe of 
NEPRF proposed that statistics such as A0 and threat scores 
could be assigned normal probability distributions and, 
therefore, be subject to null hypothesis significance 
testing criteria. The assignment of the normal probability 
distributions is based upon the Central Limit Theorem. 
Diunizio (1984b) explored this technique and presented the 
subsequent results. This appendix presents the equations 
used in this study for significance testing. 

When using three visibility categories, the null 
hypothesis is that the fraction correct will be .333 if 
only chance is involved. Using a 95% confidence test, we 
want to create an interval around the null hypothesis value 
such that values outside are considered to be significant. 
Let P_0 = .333 

n = number of values in data set 
z_{α/2} = 1.96 for 95% confidence interval 
(1 - α = .95, α/2 = .025) 

then AA = P_0 - z_{α/2} [P_0 (1 - P_0)/n]^{1/2} 

BB = P_0 + z_{α/2} [P_0 (1 - P_0)/n]^{1/2} 

where AA is the lower limit and BB is the upper limit of the 
confidence interval. 
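
As a worked example, the limits can be evaluated with a short Python sketch. It assumes the bracketed term is the usual standard error of a proportion, i.e. that it is taken to the power 1/2. The sample size n = 200 is also an assumption: it is simply the sum of the three category counts (14, 14 and 172) quoted in the second-predictor example. The function name `null_interval` is hypothetical.

```python
import math


def null_interval(n, p0=0.333, z=1.96):
    """Return (AA, BB), the limits around the chance score p0."""
    # Half-width: z times the standard error of a proportion under the null.
    half_width = z * math.sqrt(p0 * (1.0 - p0) / n)
    return p0 - half_width, p0 + half_width


aa, bb = null_interval(200)
print(f"AA = {aa:.3f}, BB = {bb:.3f}")  # AA = 0.268, BB = 0.398
```

With these inputs the upper limit is about .40, which is at least consistent with the value PA0(null) = .40 used when the second predictor was accepted.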
APPENDIX C 



WORLD METEOROLOGICAL ORGANIZATION 
HORIZONTAL SURFACE VISIBILITY CODES 



CODE      VISIBILITY (KM) 

90        <0.05 
91        0.05 
92        0.2 
93        0.5 
94        1.0 
95        2.0 
96        4.0 
97        10.0 
98        20.0 
99        50.0 or more 



Note: The values given are discrete values (i.e., not 
ranges). If the observed visibility is between two 
reportable distances as given in the table, the code figure 
of the lower reportable distance shall be reported. 
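
The "lower reportable distance" rule can be illustrated with a small Python lookup. This is only a sketch of the table above; the names `REPORTABLE_KM` and `visibility_code` are hypothetical and the sample visibilities are made up.

```python
from bisect import bisect_right

# Reportable distances (km) for codes 91..99; below 0.05 km the code is 90.
REPORTABLE_KM = [0.05, 0.2, 0.5, 1.0, 2.0, 4.0, 10.0, 20.0, 50.0]


def visibility_code(visibility_km):
    """Return the code (90-99) for an observed visibility in km."""
    if visibility_km < 0:
        raise ValueError("visibility must be non-negative")
    # Count how many reportable distances are <= the observation; that
    # selects the code of the lower reportable distance.
    return 90 + bisect_right(REPORTABLE_KM, visibility_km)


print(visibility_code(0.03), visibility_code(3.0), visibility_code(75.0))
# 90 95 99
```

An observation of 3.0 km lies between the reportable distances 2.0 and 4.0 km, so it is coded 95 rather than 96.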
APPENDIX D 



SKILL AND THREAT SCORES, DEFINITIONS (Karl, 1984) 



             3 |  R   S   T 
  FORECAST   2 |  U   V   W 
             1 |  X   Y   Z 
               +------------ 
                  1   2   3 
                  OBSERVED 

Total = R + S + T + U + V + W + X + Y + Z 
P1 = (R+U+X)/Total          P3 = (T+W+Z)/Total 
P2 = (S+V+Y)/Total          PN = greatest of P1, P2 or P3 

Raw Scores 

A0 = fraction correct = zero-class error = (X+V+T)/Total 
A1 = one-class error = (U+S+Y+W)/Total 
A2 = two-class error = (R+Z)/Total 
A0 + A1 + A2 = 1 

TS1 = Threat score for visibility category I 
    = X/(R+U+X+Y+Z) 

TS2 = Threat score for visibility category II 
    = V/(S+V+Y+U+W) 

TS12 = Threat score for visibility categories I and II 
     = (X+V)/(Total-T) 

TS12 is designed to represent the skill of forecasting 
visibility categories I and II as separate categories, 
rather than their skill as a combined category, which would 
be (U+V+X+Y)/(Total-T). 

Adjusted Scores 

AA0 = (A0-PN)/(1-PN) 

ATS1 = (TS1-P1)/(1-P1) 

ATS2 = (TS2-P2)/(1-P2) 

ATS12 = (TS12-(P1+P2))/(1-(P1+P2)) 
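
The definitions in this appendix translate directly into a short Python sketch. The cell letters R through Z follow the formulas above; the function name `skill_scores` is hypothetical, and the nine counts in the usage lines are invented purely to exercise the arithmetic (they are not results from this study). No guard is included for zero denominators.

```python
def skill_scores(R, S, T, U, V, W, X, Y, Z):
    """Raw and adjusted scores from the nine contingency-table cells."""
    total = R + S + T + U + V + W + X + Y + Z
    p1 = (R + U + X) / total   # observed frequency, category I
    p2 = (S + V + Y) / total   # observed frequency, category II
    p3 = (T + W + Z) / total   # observed frequency, category III
    pn = max(p1, p2, p3)

    a0 = (X + V + T) / total        # fraction correct
    a1 = (U + S + Y + W) / total    # one-class error
    a2 = (R + Z) / total            # two-class error
    ts1 = X / (R + U + X + Y + Z)
    ts2 = V / (S + V + Y + U + W)
    ts12 = (X + V) / (total - T)

    return {
        "A0": a0, "A1": a1, "A2": a2,
        "TS1": ts1, "TS2": ts2, "TS12": ts12,
        "AA0": (a0 - pn) / (1 - pn),
        "ATS1": (ts1 - p1) / (1 - p1),
        "ATS2": (ts2 - p2) / (1 - p2),
        "ATS12": (ts12 - (p1 + p2)) / (1 - (p1 + p2)),
    }


# Made-up counts (100 cases) used only to illustrate the calculation.
scores = skill_scores(R=1, S=2, T=60, U=3, V=8, W=6, X=10, Y=4, Z=6)
for name, value in scores.items():
    print(f"{name:6s} {value:.3f}")
```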
APPENDIX E 



NOGAPS PREDICTOR PARAMETERS AVAILABLE FOR 
NORTH ATLANTIC OCEAN EXPERIMENTS 



I. Area: North Atlantic Ocean and Mediterranean Sea 

Model output time: 1200 GMT (TAU-00, TAU-24, TAU-48) 

15 May - 7 July 1983 

Legend: *   Parameters which were not used because they 
            were considered physically unrelated 
            to marine visibility. 

        **  Parameters which were not used due to loss 
            of significant digits during transfer from 
            tape to mass storage. 

        *** Parameters existing for TAU-24 and TAU-48 
            only. 



A. Model output     Descriptive name of parameter 
   parameter 

   D1000            1000 mb geopotential height 
   D925             925 mb geopotential height 
   D850             850 mb geopotential height 
   D700             700 mb geopotential height 
   D500             500 mb geopotential height 
   D400             400 mb geopotential height 
   D300             300 mb geopotential height 
   D250             250 mb geopotential height 
   TAIR             Surface air temperature 
   T1000            1000 mb temperature 
   T925             925 mb temperature 
   T700             700 mb temperature 
   T500             500 mb temperature 
   T400 *           400 mb temperature 
   T300 *           300 mb temperature 
   T250 *           250 mb temperature 
   EAIR             Surface vapor pressure 
   E1000            1000 mb vapor pressure 
   E925             925 mb vapor pressure 
   E850             850 mb vapor pressure 
   E700             700 mb vapor pressure 
   E500             500 mb vapor pressure 
   UBLW             Boundary layer zonal wind component 
   U1000            1000 mb zonal wind component 
   U925             925 mb zonal wind component 
   U850             850 mb zonal wind component 
   U700             700 mb zonal wind component 
   U500             500 mb zonal wind component 
   U400 *           400 mb zonal wind component 
   U300 *           300 mb zonal wind component 
   U250 *           250 mb zonal wind component 
   VBLW             Boundary layer meridional wind component 
   V1000            1000 mb meridional wind component 
   V925             925 mb meridional wind component 
   V850             850 mb meridional wind component 
   V700             700 mb meridional wind component 
   V500             500 mb meridional wind component 
   V400 *           400 mb meridional wind component 
   V300 *           300 mb meridional wind component 
   V250 *           250 mb meridional wind component 
   VOR925 **        925 mb vorticity 
   VOR500 **        500 mb vorticity 
   PS               Surface pressure 
   SMF              Surface moisture flux 
   PBLD             Planetary boundary-layer depth 
   STRTFQ           Percent stratus frequency 
   STRTTH           Stratus thickness 
   SHF              Surface heat flux 
   ENTRN            Entrainment at top of marine 
                    boundary-layer 
   DRAG **          Drag coefficient (C_D) 
   PRECIP ***       Total amount (mm) of model 
                    precipitation in the last six hours 
   SHWRS ***        Total amount (mm) of model precipitation 
                    associated with cumulus convection 
                    in the last six hours 
   INSTAB ***       Boundary layer inversion instability 
   DIV925 ***       925 mb divergence 

B. Derived Parameters 

   DTDP             Vertical gradient of temperature 
                    (1000-925 mb) 
   DEDP             Vertical gradient of vapor pressure 
                    (1000-850 mb) 
   DUDP             Vertical gradient of zonal wind 
                    (1000-850 mb) 
   DVDP             Vertical gradient of meridional wind 
                    (1000-850 mb) 
   RH               Surface relative humidity 
   TV               Virtual temperature 
   DDVDP            Vertical gradient of geopotential 
                    height (1000-850 mb) 
   DVRTDP **        Vertical gradient of vorticity 
                    (500-925 mb) 
   DUUPDP           Vertical gradient of zonal wind 
                    (300-500 mb) 
   DVUPDP           Vertical gradient of meridional wind 
                    (300-500 mb) 
   ESUM             Sum of vapor pressures 
                    (1000 & 850 mb) 
   EPRD             Product of vapor pressures 
                    (1000 & 850 mb) 
   EDIF             Difference of vapor pressures 
                    (1000-850 mb) 
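
The vapor-pressure entries among the derived parameters can be illustrated with a brief Python sketch. ESUM, EPRD and EDIF follow directly from their descriptions. The form used for DEDP (a finite difference per mb across the 1000-850 mb layer) is an assumption, since the list does not state how the vertical gradients were computed. The function name `derived_vapor_pressure` and the sample values are hypothetical, and units are left unspecified.

```python
def derived_vapor_pressure(e1000, e850):
    """Combine the 1000 mb and 850 mb vapor pressures."""
    return {
        "ESUM": e1000 + e850,   # sum of vapor pressures
        "EPRD": e1000 * e850,   # product of vapor pressures
        "EDIF": e1000 - e850,   # difference of vapor pressures
        # Assumed form: difference divided by the layer depth in mb.
        "DEDP": (e1000 - e850) / (1000.0 - 850.0),
    }


# Made-up sample values, used only to show the arithmetic.
sample = derived_vapor_pressure(e1000=12.0, e850=6.0)
print(sample)
```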
TABLE V. A summary of skill scores obtained for dependent and independent 
data sets using the PR [Karl, 1984; Diunizio (1984)] and PDM 
methods on FATJUNE 1983 data from North Atlantic Ocean homogeneous 
areas 2, 3W and 4: TAU-00, TAU-24, TAU-48. 

                     DEPENDENT                    INDEPENDENT 

Area/TAU  Method   A0  A1  TS1  TS2  TS12      A0  A1  TS1  TS2  TS12 

Fig. 1. Homogeneous areas for the North Atlantic Ocean, 
May, June and July, from Lowe (1984b). 

DISCRIMINANT TRAINING SET 

Fig. 2. Distribution diagrams for a sample training set 
where (a) shows the vertical stacking of 
observations; (b) shows the (a) data in their 
category discriminant sets; (c) shows the analytic 
representation of the data in (a). 

[Axis labels: 1st PREDICTOR AXIS, 2nd PREDICTOR AXIS] 

Fig. 3. A general representation of X_j in the case of 
two predictors, from Preisendorfer (1984). 

Fig. 4. Schematic of the binary decomposition of a category 
subset, from Preisendorfer (1984). 

Fig. 5. Schematic of the binary decomposition of a sample 
set from area 2, TAU-24. 

Fig. 6. Skill diagram and contingency table results for 
FATJUNE 1983, North Atlantic Ocean area 2, TAU-00, 
PDM model. 

NUMBER OF PREDICTORS VS. PA0 SCORES 
AREA 2(A,B,C) - TAU24 - 3 VISCATS 
PREDICTORS E850, DEDP, DVDP, ENTRN, SHWRS, DUDP, D925 

Fig. 7. Comparison of PA0 scores for FATJUNE 1983, North 
Atlantic Ocean area 2(A,B,C), TAU-24, PDM model. 

NUMBER OF PREDICTORS VS. DEP A0 SCORES 
AREA 2(A,B,C) - TAU24 - 3 VISCATS 
PREDICTORS E850, DEDP, DVDP, ENTRN, SHWRS, DUDP, D925 

Fig. 8. Comparison of DEP A0 scores for FATJUNE 1983, 
North Atlantic Ocean area 2(A,B,C), TAU-24, PDM 
model. 

NUMBER OF PREDICTORS VS. IND A0 SCORES 
AREA 2(A,B,C) - TAU24 - 3 VISCATS 
PREDICTORS E850, DEDP, DVDP, ENTRN, SHWRS, DUDP, D925 

Fig. 9. Comparison of IND A0 scores for FATJUNE 1983, North 
Atlantic Ocean area 2(A,B,C), TAU-24, PDM model. 

NUMBER OF PREDICTORS VS. SCORING TECHNIQUES    DEPENDENT DATA 
AREA 2A - TAU24 - 3 VISCATS 

Skill diagram and contingency table results for 
FATJUNE 1983, North Atlantic Ocean area 2(A), 
TAU-24, PDM model. 

NUMBER OF PREDICTORS VS. SCORING TECHNIQUES    DEPENDENT DATA 
AREA 2B - TAU24 - 3 VISCATS 

Skill diagram and contingency table results for 
FATJUNE 1983, North Atlantic Ocean area 2(B), 
TAU-24, PDM model. 

NUMBER OF PREDICTORS VS. SCORING TECHNIQUES 
AREA 2C - TAU24 - 3 VISCATS 

Skill diagram and contingency table results for 
FATJUNE 1983, North Atlantic Ocean area 2(C), 
TAU-24, PDM model. 

NUMBER OF PREDICTORS VS. PA0 SCORES 
AREA 2A - TAU24 - 3 VISCATS - CRITERIA TESTS 
PREDICTORS E850, DEDP, DVDP, ENTRN, SHWRS, DUDP, D925 

Fig. 13. Comparison of PA0 scores for FATJUNE 1983, North 
Atlantic Ocean area 2(A), TAU-24, PDM model 
criteria tests. 

NUMBER OF PREDICTORS VS. DEP A0 SCORES 
AREA 2A - TAU24 - 3 VISCATS - CRITERIA TESTS 
PREDICTORS E850, DEDP, DVDP, ENTRN, SHWRS, DUDP, D925 

Fig. 14. Comparison of DEP A0 scores for FATJUNE 1983, North 
Atlantic Ocean area 2(A), TAU-24, PDM model 
criteria tests. 

NUMBER OF PREDICTORS VS. IND A0 SCORES 
AREA 2A - TAU24 - 3 VISCATS - CRITERIA TESTS 
PREDICTORS E850, DEDP, DVDP, ENTRN, SHWRS, DUDP, D925 

Fig. 15. Comparison of IND A0 scores for FATJUNE 1983, North 
Atlantic Ocean area 2(A), TAU-24, PDM model 
criteria tests. 

NUMBER OF PREDICTORS VS. SCORING TECHNIQUES    DEPENDENT DATA 
AREA 2A - TAU24 - 3 VISCATS - LAMBDA (98) 

NUMBER OF PREDICTORS VS. SCORING TECHNIQUES 
AREA 2A - TAU24 - 3 VISCATS - LAMBDA PRIME 

Skill diagram and contingency table results for 
FATJUNE 1983, North Atlantic Ocean area 2(A), 
TAU-24, PDM model lambda prime criteria test. 

FATJUNE 1983, North Atlantic Ocean 
TAU-24, Case X, PDM model. 

NUMBER OF PREDICTORS VS. PA0 SCORES 
AREA 2 - TAU24 - 2 VISCATS - CASE Y(A,B,C) 
PREDICTORS E850, DEDP, DVDP, ENTRN, SHWRS 

Fig. 19. Comparison of PA0 scores for FATJUNE 1983, North 
Atlantic Ocean area 2, TAU-24, Case Y(A,B,C), 
PDM model. 

NUMBER OF PREDICTORS VS. DEP A0 SCORES 

Fig. 20. Comparison of DEP A0 scores for FATJUNE 1983, North 
Atlantic Ocean area 2, TAU-24, Case Y(A,B,C), 
PDM model. 

NUMBER OF PREDICTORS VS. IND A0 SCORES 
AREA 2 - TAU24 - 2 VISCATS - CASE Y(A,B,C) 
PREDICTORS E850, DEDP, DVDP, ENTRN, SHWRS 

Fig. 21. Comparison of IND A0 scores for FATJUNE 1983, North 
Atlantic Ocean area 2, TAU-24, Case Y(A,B,C), 
PDM model. 

NUMBER OF PREDICTORS VS. SCORING TECHNIQUES    DEPENDENT DATA 
AREA 2 - TAU24 - 2 VISCATS - CASE Y(A) 
PREDICTORS E850, DEDP, DVDP, ENTRN, SHWRS 

Fig. 22. Skill diagram and contingency table results for 
FATJUNE 1983, North Atlantic Ocean area 2, TAU-24, 
Case Y(A), PDM model. 

Fig. 23. Skill diagram and contingency table results for 
FATJUNE 1983, North Atlantic Ocean area 2, TAU-24, 
Case Y(B), PDM model. 

Fig. 24. Skill diagram and contingency table results for 
FATJUNE 1983, North Atlantic Ocean area 2, TAU-24, 
Case Y(C), PDM model. 

NUMBER OF PREDICTORS VS. SCORING TECHNIQUES 
AREA 3W - TAU00 - 3 VISCATS 

Fig. 27. Skill diagram and contingency table results for 
FATJUNE 1983, North Atlantic Ocean area 3W, TAU-24, 
PDM model. 

NUMBER OF PREDICTORS VS. SCORING TECHNIQUES    DEPENDENT DATA 
AREA 4 - TAU24 - 3 VISCATS 

NUMBER OF PREDICTORS VS. SCORING TECHNIQUES    DEPENDENT DATA 
AREA T - TAU48 - 3 VISCATS 